How to Start Training: The Effect of Initialization and Architecture
Abstract
We investigate the effects of initialization and architecture on the start of training in deep ReLU nets. We identify two common failure modes for early training in which the mean and variance of activations are poorly behaved. For each failure mode, we give a rigorous proof of when it occurs at initialization and how to avoid it. The first failure mode, exploding/vanishing mean activation length, can be avoided by initializing weights from a symmetric distribution with variance 2/fan-in. The second failure mode, exponentially large variance of activation length, can be avoided by keeping constant the sum of the reciprocals of layer widths. We demonstrate empirically the effectiveness of our theoretical results in predicting when networks are able to start training. In particular, we note that many popular initializations fail our criteria, whereas correct initialization and architecture allow much deeper networks to be trained.
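The two criteria in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. The function names are hypothetical, a zero-mean Gaussian is assumed as one example of a symmetric distribution, biases are omitted, and which layer widths enter the reciprocal sum is left to the caller as an assumption.

```python
import numpy as np


def init_relu_weights(widths, rng):
    """Zero-mean Gaussian weights with variance 2/fan_in for each layer.

    widths = [n_0, n_1, ..., n_d]; layer j maps n_{j-1} inputs to n_j outputs.
    The Gaussian is an assumed choice of symmetric distribution.
    """
    return [
        rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
        for fan_in, fan_out in zip(widths[:-1], widths[1:])
    ]


def sum_reciprocal_widths(layer_widths):
    """Sum of 1/n over the given layer widths."""
    return sum(1.0 / n for n in layer_widths)


def normalized_sq_length(x, weights):
    """Pass x through bias-free ReLU layers; return ||activations||^2 / width."""
    a = np.asarray(x, dtype=float)
    for W in weights:
        a = np.maximum(W @ a, 0.0)
    return float(a @ a) / a.size
```

Under these assumptions, averaging `normalized_sq_length` over many random draws of the weights should stay close to the same quantity computed on the input, which is the behavior the first criterion targets. For the second criterion, ten layers of width 100 and twenty layers of width 200 both give a reciprocal sum of 0.1, so depth can grow without increasing the sum as long as width grows with it.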